Skip to content

Add support for per-chapter remarks#408

Open
pmachapman wants to merge 6 commits into
masterfrom
per_chapter_remarks
Open

Add support for per-chapter remarks#408
pmachapman wants to merge 6 commits into
masterfrom
per_chapter_remarks

Conversation

@pmachapman
Copy link
Copy Markdown
Collaborator

@pmachapman pmachapman commented Apr 16, 2026

Part of sillsdev/serval#905


This change is Reviewable

@pmachapman pmachapman requested review from Enkidu93 and ddaspit April 16, 2026 02:41
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Apr 16, 2026

Codecov Report

❌ Patch coverage is 93.33333% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 73.11%. Comparing base (89406ca) to head (eb42d8d).

Files with missing lines Patch % Lines
src/SIL.Machine/Corpora/UsfmParser.cs 69.23% 3 Missing and 1 partial ⚠️
src/SIL.Machine/Corpora/UpdateUsfmParserHandler.cs 95.23% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #408      +/-   ##
==========================================
+ Coverage   73.07%   73.11%   +0.04%     
==========================================
  Files         439      439              
  Lines       36712    36787      +75     
  Branches     5043     5058      +15     
==========================================
+ Hits        26828    26898      +70     
- Misses       8773     8777       +4     
- Partials     1111     1112       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown
Collaborator

@Enkidu93 Enkidu93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this work relate to this PR Matthew is working on: sillsdev/machine.py#285?

@Enkidu93 reviewed 3 files and all commit messages, and made 1 comment.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on ddaspit).

Copy link
Copy Markdown
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ddaspit reviewed 3 files and all commit messages, and made 1 comment.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on pmachapman).


src/SIL.Machine/Corpora/UpdateUsfmParserHandler.cs line 468 at r1 (raw file):

                        {
                            // Add the remarks just after the specified chapter,
                            // skipping any alternate and published chapter numbers

Unless I'm missing something, I don't think this will skip \ca and \cp markers.

@pmachapman pmachapman force-pushed the per_chapter_remarks branch 2 times, most recently from c9711e8 to 5acb2fd Compare April 19, 2026 21:00
Copy link
Copy Markdown
Collaborator Author

@pmachapman pmachapman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Enkidu93 I was unaware of Matthew's PR (sorry that is on me, I forgot to look). It is a little different - this PR keeps support for book level remarks, and allows different remarks per chapter. I've commented on his PR, and am happy to change this PR to match his if my implementation is incorrect (as I couldn't find concrete specs I got to this PR via an hour of iteration and testing).

@pmachapman made 2 comments.
Reviewable status: 2 of 3 files reviewed, 1 unresolved discussion (waiting on ddaspit and Enkidu93).


src/SIL.Machine/Corpora/UpdateUsfmParserHandler.cs line 468 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

Unless I'm missing something, I don't think this will skip \ca and \cp markers.

Done. That was a hangover from a previous iteration of this change.

@pmachapman pmachapman force-pushed the per_chapter_remarks branch 2 times, most recently from 4ea8c05 to e234f23 Compare April 19, 2026 21:17
Copy link
Copy Markdown
Collaborator

@Enkidu93 Enkidu93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries. I think we'll probably want some of the changes from both PRs. We'll want your per-chapter remarks and his support for partial USFM. It's probably best to pick machine or machine.py and make all the changes there and then port. I'll leave it to you and Matthew to figure out what makes the most sense in regard to where and who.

@Enkidu93 reviewed 1 file and all commit messages, and made 1 comment.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on ddaspit).

Copy link
Copy Markdown
Collaborator Author

@pmachapman pmachapman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll want your per-chapter remarks and his support for partial USFM

I have ported the partial USFM support from Matthew's PR.

@pmachapman made 1 comment.
Reviewable status: 1 of 4 files reviewed, 1 unresolved discussion (waiting on ddaspit and Enkidu93).

Copy link
Copy Markdown
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ddaspit reviewed 4 files and all commit messages, made 1 comment, and resolved 1 discussion.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on pmachapman).


src/SIL.Machine/Corpora/UsfmTokenizer.cs line 43 at r3 (raw file):

            string usfm,
            bool preserveWhitespace = false,
            IReadOnlyList<int> filterTokensByChapter = null

I don't want to mess with UsfmTokenizer. I would prefer to perform the filtering in ParatextProjectTextUpdaterBase.

Copy link
Copy Markdown
Collaborator Author

@pmachapman pmachapman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pmachapman made 1 comment.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on ddaspit).


src/SIL.Machine/Corpora/UsfmTokenizer.cs line 43 at r3 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

I don't want to mess with UsfmTokenizer. I would prefer to perform the filtering in ParatextProjectTextUpdaterBase.

I have moved FilterTokensByChapter to ParatextProjectTextUpdaterBase, and made it internal so the unit tests that require it can used it.

Copy link
Copy Markdown
Collaborator

@Enkidu93 Enkidu93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Enkidu93 reviewed 4 files and all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on ddaspit).

Copy link
Copy Markdown
Collaborator

@Enkidu93 Enkidu93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Enkidu93 made 1 comment.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on ddaspit).


src/SIL.Machine/Corpora/UsfmTokenizer.cs line 43 at r3 (raw file):

Previously, pmachapman (Peter Chapman) wrote…

I have moved FilterTokensByChapter to ParatextProjectTextUpdaterBase, and made it internal so the unit tests that require it can used it.

I put a comment in Matthew's PR regarding this. If he refactors there to use a Memory... updater implementation, you'll want to update accordingly.

Copy link
Copy Markdown
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

@ddaspit reviewed 4 files and all commit messages, made 1 comment, and resolved 1 discussion.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on pmachapman).

@pmachapman pmachapman force-pushed the per_chapter_remarks branch from d164ee4 to c3c5ca6 Compare May 10, 2026 20:02
@pmachapman pmachapman requested review from Enkidu93 and ddaspit May 10, 2026 22:06
Copy link
Copy Markdown
Collaborator Author

@pmachapman pmachapman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ddaspit @Enkidu93 I'm requesting a re-review as I have not only bought this branch into sync with @mshannon-sil 's work in machine.py, but I have also done some clean up:

  • Refactored DefaultParatextProjectSettings to its own file and made it no longer a nested class
  • Did some clean up work in UpdateUsfmParserHandlerTests to use modern C# and NUnit features added in the dotnet 10 upgrade

If you don't like the clean up, or want that in a separate PR, I have kept those changes to their own commits which I can drop/move as you please.

@pmachapman made 2 comments.
Reviewable status: 2 of 13 files reviewed, all discussions resolved (waiting on ddaspit and Enkidu93).


src/SIL.Machine/Corpora/UsfmTokenizer.cs line 43 at r3 (raw file):

Previously, Enkidu93 (Eli C. Lowry) wrote…

I put a comment in Matthew's PR regarding this. If he refactors there to use a Memory... updater implementation, you'll want to update accordingly.

Done.

Copy link
Copy Markdown
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

@ddaspit reviewed 11 files and all commit messages, and made 1 comment.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on Enkidu93).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants